Statistical Measures of Distance

Authors

  • ARTUR JAROSZEWICZ
  • M. Roytman
Abstract

Measures of statistical distance are widely used in techniques such as clustering and classification, where the goal is to identify objects that are in some sense similar to each other. The choice of distance between data points is an important one, and there are many measures to choose from. In this paper we present some of the most commonly used measures of distance and compare their usefulness when analyzing data sets with different properties. These include general metrics such as the Minkowski distance, whose family includes the classic Euclidean, Chebyshev, and Manhattan distances, and the Hamming distance. We present the Mahalanobis metric, which is similar to the Euclidean but corrects for strong structure in the data. In addition, we present the concept of correlation as a distance measure, covering the properties of the Pearson and Spearman correlations.
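As a rough illustration of the measures named in the abstract, the following minimal Python sketch implements the Minkowski family (with Manhattan and Euclidean as the orders p = 1 and p = 2, and Chebyshev as the limiting case), the Hamming distance, and a correlation-based dissimilarity. The function names are hypothetical, and defining the correlation distance as 1 - r is an assumption made here for illustration; the abstract does not state which convention the paper uses. The Mahalanobis and Spearman measures are left out to keep the sketch free of external libraries.

```python
import math


def minkowski(x, y, p):
    """Minkowski distance of order p (p >= 1) between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if p < 1:
        raise ValueError("p must be at least 1 for the result to be a metric")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def manhattan(x, y):
    """Minkowski distance with p = 1 (sum of absolute differences)."""
    return minkowski(x, y, 1)


def euclidean(x, y):
    """Minkowski distance with p = 2 (straight-line distance)."""
    return minkowski(x, y, 2)


def chebyshev(x, y):
    """Limit of the Minkowski distance as p grows: the largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


def hamming(x, y):
    """Number of positions at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(a != b for a, b in zip(x, y))


def pearson_distance(x, y):
    """Correlation-based dissimilarity, taken here as 1 - r (an assumed convention).

    Ranges from 0 (perfect positive correlation) to 2 (perfect negative correlation).
    Undefined when either input is constant, since its standard deviation is zero.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (norm_x * norm_y)
```

For the points (0, 0) and (3, 4), the Manhattan distance is 7, the Euclidean distance is 5, and the Chebyshev distance is 4, which shows how the same pair of points can look nearer or farther depending on the measure chosen.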

Similar resources

Information Measures via Copula Functions

In applications of differential geometry to problems of parametric inference, the notion of divergence is often used to measure the separation between two parametric densities. Among them, in this paper, we will verify measures such as Kullback-Leibler information, J-divergence, Hellinger distance, -Divergence, … and so on. Properties and results related to distance between probability d...

An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analyzing process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...

New distance and similarity measures for hesitant fuzzy soft sets

The hesitant fuzzy soft set (HFSS), as a combination of hesitant fuzzy and soft sets, is regarded as a useful tool for dealing with the uncertainty and ambiguity of real-world problems. In HFSSs, each element is defined in terms of several parameters with arbitrary membership degrees. In addition, distance and similarity measures are considered as the important tools in different areas such as ...

Several new results based on the study of distance measures of intuitionistic fuzzy sets

It is doubtless that intuitionistic fuzzy set (IFS) theory plays an increasingly important role in solving the problems under uncertain situation. As one of the most critical members in the theory, distance measure is widely used in many aspects. Nevertheless, it is a pity that part of the existing distance measures has some drawbacks in practical significance and accuracy. To make up for their...

Parameter Estimation of Some Archimedean Copulas Based on Minimum Cramér-von-Mises Distance

The purpose of this paper is to introduce a new estimation method for estimating the Archimedean copula dependence parameter in the non-parametric setting. The estimation of the dependence parameter has been selected as the value that minimizes the Cramér-von-Mises distance which measures the distance between Empirical Bernstein Kendall distribution function and true Kendall distribution functi...

Publication date: 2014